An Improved Inter-Intra Contrastive Learning Framework on Self-Supervised Video Representation
Abstract
In this paper, we propose a self-supervised contrastive learning method to learn video feature representations. In traditional self-supervised contrastive learning methods, constraints from anchor, positive, and negative data pairs are used to train the model. In such a case, different samplings of the same video are treated as positives, and clips from different videos are treated as negatives. Because spatio-temporal information is important for video representation, we set the temporal constraints more strictly by introducing intra-negative samples. In addition to samples from different videos, the negative samples are extended by breaking the temporal relations in clips from the anchor video. With the proposed Inter-Intra Contrastive (IIC) framework, we can train spatio-temporal convolutional networks to learn feature representations for videos. Strong data augmentations, residual clips, as well as a head projector are utilized to construct an improved version. Three kinds of intra-negative generation functions are proposed, and extensive experiments using different network backbones are conducted on benchmark datasets. Without using pre-computed optical flow data, our improved version can outperform the previous IIC by a large margin, with improvements of 19.4 points (from 36.8% to 56.2%) and 5.2 points (from 15.5% to 20.7%) in top-1 accuracy on the UCF101 and HMDB51 datasets for video retrieval, respectively. For video recognition, improvements of over 3 points can also be obtained on these two benchmark datasets. Discussions and visualizations validate that our IICv2 can capture better temporal clues and indicate the potential mechanism.
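The idea of extending the negatives with temporally broken copies of the anchor clip can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the abstract does not name the three intra-negative generation functions, so `shuffle_frames` and `repeat_frame` are hypothetical examples of "breaking temporal relations", and `info_nce` is a generic InfoNCE-style loss rather than the paper's exact objective. Clips are assumed to be arrays of shape (T, H, W, C), and features are plain vectors standing in for network outputs.

```python
import numpy as np

def shuffle_frames(clip, rng):
    """Hypothetical intra-negative: permute the time axis of a (T, H, W, C) clip."""
    t = clip.shape[0]
    perm = rng.permutation(t)
    # Re-draw if the permutation is the identity, so the frame order really changes.
    while t > 1 and np.array_equal(perm, np.arange(t)):
        perm = rng.permutation(t)
    return clip[perm]

def repeat_frame(clip, rng):
    """Hypothetical intra-negative: repeat one randomly chosen frame T times."""
    idx = rng.integers(clip.shape[0])
    return np.repeat(clip[idx:idx + 1], clip.shape[0], axis=0)

def info_nce(anchor, positive, negatives, temperature=0.07):
    """Generic InfoNCE-style loss for one anchor (assumed form, not the paper's).

    anchor, positive: feature vectors of shape (D,)
    negatives: feature matrix of shape (K, D), which may mix inter-video
    negatives with intra-negatives derived from the anchor video.
    """
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a, p, n = normalize(anchor), normalize(positive), normalize(negatives)
    # Index 0 holds the positive similarity, the rest are negative similarities.
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    logits -= logits.max()  # numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))
```

In a training loop, the intra-negative clips would be passed through the same network as the anchor, and their features appended to the rows of `negatives` alongside features of clips from other videos.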
Similar resources
Time-Contrastive Networks: Self-Supervised Learning from Video
We propose a self-supervised approach for learning representations and robotic behaviors entirely from unlabeled videos recorded from multiple viewpoints, and study how this representation can be used in two robotic imitation settings: imitating object interactions from videos of humans, and imitating human poses. Imitation of human behavior requires a viewpoint-invariant representation that ca...
An Improved Motion Vector Estimation Approach for Video Error Concealment Based on the Video Scene Analysis
In order to enhance the accuracy of the motion vector (MV) estimation and also reduce the error propagation issue during the estimation, in this paper, a new adaptive error concealment (EC) approach is proposed based on the information extracted from the video scene. In this regard, the motion information of the video scene around the degraded MB is first analyzed to estimate the motion type of...
Video-Based Person Re-Identification by Simultaneously Learning Intra-Video and Inter-Video Distance Metrics
Video-based person re-identification (re-id) is an important application in practice. However, only a few methods have been presented for this problem. Since large variations exist between different pedestrian videos, as well as within each video, it’s challenging to conduct re-identification between pedestrian videos. In this paper, we propose a simultaneous intra-video and inter-video distanc...
An Improved Semi-Supervised Clustering Algorithm Based on Active Learning
Semi-supervised clustering is one of the major tasks and aims at grouping data objects into meaningful classes (clusters) such that the similarity of objects within clusters is maximized and the similarity of objects between clusters is minimized. The dataset may sometimes be of a mixed nature, that is, it may consist of both numeric and categorical data. Naturally these two types of...
An Improved Semi-supervised Clustering Algorithm Based on Active Learning
To address difficulties such as cluster deviation and high-dimensional data processing in traditional semi-supervised clustering algorithms, a semi-supervised clustering algorithm based on active learning is proposed; this algorithm can effectively solve both problems. Using active learning strategies in the algorithm can obtain a large amount of in...
Journal
Journal title: IEEE Transactions on Circuits and Systems for Video Technology
Year: 2022
ISSN: 1051-8215, 1558-2205
DOI: https://doi.org/10.1109/tcsvt.2022.3141051